AITopics | unaligned data




Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data

Zhang, Jiancheng, Zhu, Yinglun

arXiv.org Artificial Intelligence, Oct-7-2025

Active learning (AL) is a principled strategy to reduce annotation cost in data-hungry deep learning. However, existing AL algorithms focus almost exclusively on unimodal data, overlooking the substantial annotation burden in multimodal learning. We introduce the first framework for multimodal active learning with unaligned data, where the learner must actively acquire cross-modal alignments rather than labels on pre-aligned pairs. This setting captures the practical bottleneck in modern multimodal pipelines such as CLIP and SigLIP, where unimodal features are easy to obtain but high-quality alignment is costly. We develop a new algorithm that combines uncertainty and diversity principles in a modality-aware design, achieves linear-time acquisition, and applies seamlessly to both pool-based and streaming-based settings. Extensive experiments on benchmark datasets demonstrate that our approach consistently reduces multimodal annotation cost while preserving performance; for instance, on the ColorSwap dataset it cuts annotation requirements by up to $40\%$ without loss in accuracy.
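The abstract does not spell out the acquisition rule, so the following is only a generic sketch of how an uncertainty criterion and a diversity criterion can be combined when choosing which unaligned items to send for alignment. Everything here is an assumption for illustration: the function names, the margin-based uncertainty score, and the precomputed `buckets` (a hypothetical cluster id per item). It is not the authors' algorithm.

```python
def margin(scores):
    """Gap between the two highest similarity scores; a small gap means high uncertainty."""
    top, second = sorted(scores, reverse=True)[:2]
    return top - second


def select_for_alignment(similarities, buckets, budget):
    """Pick up to `budget` items to annotate.

    similarities[i]: current model's similarity scores between item i and its
                     candidate partners in the other modality (hypothetical input).
    buckets[i]:      hypothetical cluster id of item i, used to enforce diversity.
    """
    most_uncertain = {}  # bucket id -> (margin, item index)
    # One pass over the pool: keep only the most uncertain item in each bucket.
    for i, (scores, bucket) in enumerate(zip(similarities, buckets)):
        m = margin(scores)
        if bucket not in most_uncertain or m < most_uncertain[bucket][0]:
            most_uncertain[bucket] = (m, i)
    # Rank the per-bucket winners by uncertainty and spend the budget on them.
    ranked = sorted(most_uncertain.values())
    return [i for _, i in ranked[:budget]]


similarities = [[0.9, 0.1], [0.55, 0.5], [0.6, 0.58], [0.7, 0.2]]
buckets = [0, 0, 1, 1]
chosen = select_for_alignment(similarities, buckets, budget=2)
```

Taking at most one item per bucket keeps the batch spread out, while the margin ranking directs the budget to the items the current model finds hardest to match.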

artificial intelligence, deep learning, machine learning, (15 more...)


2510.03247

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)


FedUD: Exploiting Unaligned Data for Cross-Platform Federated Click-Through Rate Prediction

Ouyang, Wentao, Dong, Rui, Tao, Ri, Liu, Xiangzheng

arXiv.org Artificial Intelligence, Jul-25-2024

Click-through rate (CTR) prediction plays an important role in online advertising platforms. Most existing methods use data from the advertising platform itself for CTR prediction. As user behaviors also exist on many other platforms, e.g., media platforms, it is beneficial to exploit such complementary information to better model user interest and improve CTR prediction performance. However, due to privacy concerns, data from different platforms cannot be uploaded to a server for centralized model training. Vertical federated learning (VFL) provides a possible solution that keeps the raw data with the respective participating parties and learns a collaborative model in a privacy-preserving way. However, traditional VFL methods only utilize aligned data with common keys across parties, which strongly restricts their application scope. In this paper, we propose FedUD, which is able to exploit unaligned data, in addition to aligned data, for more accurate federated CTR prediction. FedUD contains two steps. In the first step, FedUD utilizes aligned data across parties like traditional VFL, but it additionally includes a knowledge distillation module. This module distills useful knowledge from the guest party's high-level representations and guides the learning of a representation transfer network. In the second step, FedUD applies the learned knowledge to enrich the representations of the host party's unaligned data such that both aligned and unaligned data can contribute to federated model training. Experiments on two real-world datasets demonstrate the superior performance of FedUD for federated CTR prediction.
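The two steps can be illustrated with a deliberately tiny sketch. It assumes scalar representations and swaps the paper's representation transfer network for a closed-form least-squares line; the function names and the toy numbers are hypothetical and only show the data flow, not FedUD's actual model.

```python
from statistics import mean


def fit_transfer(host_reps, guest_reps):
    """Step 1 (stand-in): on aligned samples, fit guest ~ w * host + b by least squares.

    In the abstract this role is played by a representation transfer network
    guided by knowledge distillation; a 1-D linear fit is an assumption made here.
    """
    h_bar, g_bar = mean(host_reps), mean(guest_reps)
    var = sum((h - h_bar) ** 2 for h in host_reps)
    cov = sum((h - h_bar) * (g - g_bar) for h, g in zip(host_reps, guest_reps))
    w = cov / var
    b = g_bar - w * h_bar
    return w, b


def enrich(host_reps, w, b):
    """Step 2 (stand-in): give unaligned host samples an inferred guest-side representation."""
    return [(h, w * h + b) for h in host_reps]


# Toy data: aligned samples have both host and guest representations.
aligned_host = [1.0, 2.0, 3.0]
aligned_guest = [3.0, 5.0, 7.0]
# Unaligned samples exist only on the host side.
unaligned_host = [4.0, 0.0]

w, b = fit_transfer(aligned_host, aligned_guest)
# Aligned and enriched unaligned samples now share one format for model training.
training_set = list(zip(aligned_host, aligned_guest)) + enrich(unaligned_host, w, b)
```

The point of the sketch is the last line: once the transfer function is learned from aligned pairs, unaligned host samples can be placed in the same training set instead of being discarded.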

guest party, host party, unaligned data, (13 more...)


doi: 10.1145/3626772.3657941

2407.18472

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Beijing > Beijing (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.49)


Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models

Ye, Rui, Chai, Jingyi, Liu, Xiangrui, Yang, Yaodong, Wang, Yanfeng, Chen, Siheng

arXiv.org Artificial Intelligence, Jun-15-2024

Federated learning (FL) enables multiple parties to collaboratively fine-tune a large language model (LLM) without direct data sharing. Ideally, by training on decentralized data that is aligned with human preferences and safety principles, federated instruction tuning (FedIT) can produce an LLM that behaves in a helpful and safe manner. In this paper, we reveal for the first time the vulnerability of safety alignment in FedIT by proposing a simple, stealthy, yet effective safety attack method. Specifically, malicious clients can automatically generate attack data without manual effort and attack the FedIT system by training their local LLMs on such data. Unfortunately, this safety attack not only compromises the safety alignment of an LLM trained via FedIT, but also cannot be effectively defended against by many existing FL defense methods. To address this, we further propose a post-hoc defense method that relies on a fully automated pipeline: generating defense data and further fine-tuning the LLM. Extensive experiments show that our safety attack method can significantly compromise the LLM's safety alignment (e.g., reducing the safety rate by 70%), which existing defense methods cannot effectively counter (at most 4% absolute improvement), while our safety defense method can significantly enhance the attacked LLM's safety alignment (at most 69% absolute improvement).

defense method, instruction, llm, (13 more...)


2406.1063

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Surrey > Guildford (0.04)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


A hybrid Decoder-DeepONet operator regression framework for unaligned observation data

Chen, Bo, Wang, Chenyu, Li, Weipeng, Fu, Haiyang

arXiv.org Artificial Intelligence, Aug-17-2023

Deep neural operators (DNOs) have been utilized to approximate nonlinear mappings between function spaces. However, DNOs face the challenge of increased dimensionality and computational cost associated with unaligned observation data. In this study, we propose a hybrid Decoder-DeepONet operator regression framework to handle unaligned data effectively. Additionally, we introduce a Multi-Decoder-DeepONet, which utilizes an average field of training data as input augmentation. The consistency of both frameworks with operator approximation theory is established on the basis of the universal approximation theorem. Two numerical experiments, the Darcy problem and the flow field around an airfoil, are conducted to validate the efficiency and accuracy of the proposed methods. Results illustrate the advantages of Decoder-DeepONet and Multi-Decoder-DeepONet in handling unaligned observation data and showcase their potential for improving prediction accuracy.
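For context, the standard DeepONet that this work builds on approximates an operator $G$ by an inner product of a branch network and a trunk network. The form below is the baseline from the DeepONet literature, not the hybrid Decoder-DeepONet itself, whose architecture the abstract does not specify; the symbols $p$, $m$, $b_k$, and $t_k$ are notation assumed here.

$$
G(u)(y) \approx \sum_{k=1}^{p} b_k\big(u(x_1), \dots, u(x_m)\big)\, t_k(y)
$$

Here $u$ is the input function sampled at $m$ fixed sensor locations $x_1, \dots, x_m$, the $b_k$ are the branch network outputs, and the $t_k$ are the trunk network outputs at a query location $y$. Because the trunk network takes one query location at a time, observations that sit at different locations for each sample are typically fed as separate $(u, y)$ pairs, which is presumably the growth in cost the abstract refers to.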

deeponet, multi-decoder-deeponet, unaligned data, (14 more...)


2308.09274

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
